 video source


Automated Detection of Sport Highlights from Audio and Video Sources

Della Santa, Francesco, Lalli, Morgana

arXiv.org Artificial Intelligence

This study presents a novel Deep Learning-based and lightweight approach for the automated detection of sports highlights (HLs) from audio and video sources. HL detection is a key task in sports video analysis, traditionally requiring significant human effort. Our solution leverages Deep Learning (DL) models trained on relatively small datasets of audio Mel-spectrograms and grayscale video frames, achieving promising accuracy rates of 89% and 83% for audio and video detection, respectively. The use of small datasets, combined with simple architectures, demonstrates the practicality of our method for fast and cost-effective deployment. Furthermore, an ensemble model combining both modalities shows improved robustness against false positives and false negatives. The proposed methodology offers a scalable solution for automated HL detection across various types of sports video content, reducing the need for manual intervention. Future work will focus on enhancing model architectures and extending this approach to broader scene-detection tasks in media analysis.


ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark

Dang, Ronghao, Yuan, Yuqian, Zhang, Wenqi, Xin, Yifei, Zhang, Boqiang, Li, Long, Wang, Liuyi, Zeng, Qinyang, Li, Xin, Bing, Lidong

arXiv.org Artificial Intelligence

The enhancement of generalization in robots by large vision-language models (LVLMs) is increasingly evident. Therefore, the embodied cognitive abilities of LVLMs based on egocentric videos are of great interest. However, current datasets for embodied video question answering lack comprehensive and systematic evaluation frameworks. Critical embodied cognitive issues, such as robotic self-cognition, dynamic scene perception, and hallucination, are rarely addressed. To tackle these challenges, we propose ECBench, a high-quality benchmark designed to systematically evaluate the embodied cognitive abilities of LVLMs. ECBench features a diverse range of scene video sources, open and varied question formats, and 30 dimensions of embodied cognition. To ensure quality, balance, and high visual dependence, ECBench uses class-independent meticulous human annotation and multi-round question screening strategies. Additionally, we introduce ECEval, a comprehensive evaluation system that ensures the fairness and rationality of the indicators. Utilizing ECBench, we conduct extensive evaluations of proprietary, open-source, and task-specific LVLMs. ECBench is pivotal in advancing the embodied cognitive capabilities of LVLMs, laying a solid foundation for developing reliable core models for embodied agents. All data and code are available at https://github.com/Rh-Dang/ECBench.


SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

Kim, Junho, Kim, Hyunjun, Lee, Hosu, Ro, Yong Man

arXiv.org Artificial Intelligence

Despite advances in Large Multi-modal Models, applying them to long and untrimmed video content remains challenging due to limitations in context length and substantial memory overhead. These constraints often lead to significant information loss and reduced relevance in the model responses. With the exponential growth of video data across web platforms, understanding long-form video is crucial for advancing generalized intelligence. In this paper, we introduce SALOVA: Segment-Augmented LOng Video Assistant, a novel video-LLM framework designed to enhance the comprehension of lengthy video content through a targeted retrieval process. We address two main challenges to achieve it: (i) We present the SceneWalk dataset, a high-quality collection of 87.8K long videos, each densely captioned at the segment level to enable models to capture scene continuity and maintain rich descriptive context. (ii) We develop robust architectural designs integrating a dynamic routing mechanism and a spatio-temporal projector to efficiently retrieve and process relevant video segments based on user queries. Our framework mitigates the limitations of current video-LMMs by allowing for precise identification and retrieval of relevant video segments in response to queries, thereby improving the contextual relevance of the generated responses. Through extensive experiments, SALOVA demonstrates enhanced capability in processing complex long-form videos, showing significant capability to maintain contextual integrity across extended sequences.


LG SP9YA soundbar review: This 5.1.2 speaker gets its surround effects from the sides

PCWorld

That's the question LG's SP9YA poses, which comes labeled as a 5.1.2 soundbar. The SP9YA isn't the first soundbar to attempt this trick; the Creative SXFI Carrier tries something similar, and with similarly mixed results. The SP9YA also packs an array of impressive features, including AI-powered room correction, eARC, built-in AirPlay 2 and Chromecast, as well as support for Alexa speaker groups and Spotify Connect. This review is part of TechHive's coverage of the best soundbars. Click that link to read reviews of competing products, along with a buyer's guide to the features you should consider when shopping.


Vizio M-Series M512a-H6 review: This mid-range soundbar delivers big, dynamic sound

PCWorld

What Vizio's mid-range M512a-H6 lacks in Wi-Fi connectivity, it makes up for in big, exciting, room-filling sound. Slated to ship in July for a list price of $450, this 5.1.2-channel M-series soundbar from Vizio is easy to set up, offers plenty of discrete audio adjustments, and delivers immersive Dolby Atmos and DTS:X sound courtesy of upfiring drivers. Now, a sub-$500 soundbar like the M512a-H6 (which Vizio calls an "M-series" soundbar, sitting between its high-end P-series and budget-priced V-series models) will necessarily mean settling for some compromises--in this case, no Wi-Fi support, which means you'll have to do without AirPlay 2 and Chromecast functionality, as well as support for native audio streaming. The good news is that you can add a voice assistant by connecting a smart speaker via a 3.5mm jack or Bluetooth, a nifty feature that's new to Vizio's 2021 soundbars.


Polk Audio MagniFi 2 soundbar review: Virtual 3D audio and built-in Chromecast, but iffy bass

PCWorld

Polk Audio manages to tease some relatively impressive virtual 3D audio out of its 2.1-channel MagniFi soundbar, which makes the speaker's subpar bass response all the more disappointing. Equipped with built-in Chromecast and Google Assistant support, the MagniFi 2 is easy to set up, and Polk Audio's custom digital sound processing delivers subtle surround and height effects without undue harshness. The $499 MagniFi 2 also comes with three HDMI inputs, a pleasant surprise for a soundbar in this price range. But while it's unquestionably an upgrade over standard TV speakers, the MagniFi 2's otherwise crisp audio is undermined by muddy bass from the wireless subwoofer, robbing the sound of punchiness. Polk Audio has three lines of soundbars.


Yamaha MusicCast BAR 400 review: A $500 soundbar with multi-room audio, but no Dolby Atmos

PCWorld

The two-year-old Yamaha BAR 400 is one of the least expensive soundbars around to offer high-resolution multi-room audio support, but you'll need to sacrifice other features--such as Dolby Atmos and a center channel--in the bargain. This 2.1-channel model boasts support for Yamaha's robust MusicCast multi-room audio platform and Apple's AirPlay 2, and it serves up solid 2D movie audio and top-notch music performance. But the $500 MusicCast BAR 400 lacks native support for Dolby Atmos and DTS:X, the two leading 3D audio formats that are fast becoming de rigueur in this price range, and its DTS Virtual:X mode sounds too harsh to be a viable substitute. With its $500 price tag and support for Yamaha's high-resolution MusicCast multi-room audio system, the two-year-old Yamaha MusicCast BAR 400 is something of a throwback in Yamaha's soundbar lineup. In the past couple of years, Yamaha has focused more on budget-priced DTS Virtual:X soundbars (think $350 or less), none of which support MusicCast.


Coronavirus Fragments 15: Medical Precrime and the Hackable Brain

#artificialintelligence

Horrifying Glimpse Into How DARPA Will "Save" You From COVID-19 and Venezuela Coup Tied Back To Trump (7 May 2020). In my last two posts, The New World Emperor and Wake Up, You're Next, I stated that the main worry in the nCov pandemic is not just the virus - its origins, seriousness, the number of strains, and their forthcoming spread - but how the pandemic will be controlled. I argued that mass vaccines and tracking will involve the transition from computer-based to human-based operating systems. A series of pandemic outbreaks now and in coming years will be followed by successive vaccines, which will implant weaponized AI and nanotechnology on a mass scale, in order to establish brain-machine interfaces around the globe, paired with a cryptocurrency as a reward or punishment system. If you accept this technology into your body, the control of the few over the many will be complete, and the Internet of Thoughts will be born. To understand injectable technologies, see (above) The Last American Vagabond's 7 May 2020 interview with independent journalist, Whitney Webb.


Materialist Rhetoric: The Cosmic Indifference of Alien AI Gods

#artificialintelligence

This week, I am selecting vocabulary used by materialists to reveal their long game. Materialism here refers loosely to those who subscribe to secular and scientific values, that is, those who explore the world through the strictly physical. I deconstruct some materialist values here, not to write an anti-scientific or technophobic screed, but to consider alternative approaches toward science and technology. I mean to show here that what we think of as 'rationalism,' 'secularism,' 'materialism,' 'science,' and 'technology' are social constructions, grounded in obsolete, 19th and 20th century modernist concepts of order and madness. I will discuss this historical background and alternatives to these concepts in future posts.


How AI Can Supercharge Customer Service - ITChronicles

#artificialintelligence

As such, it should be no surprise that many forward-thinking companies are already experimenting with artificial intelligence (AI) technology to improve their processes and service their customers better. Primarily, AI is currently being deployed in customer service as a means to either augment or, in some cases, replace human agents. The primary goals of these initiatives are to improve the customer experience and reduce costs associated with human service agents. While it is of course true that AI and automation technologies are not yet sophisticated enough to perform all of the tasks currently undertaken by human representatives, many routine consumer requests are simple enough for AI to handle without human input. Perhaps more importantly, AI can deliver a level of responsiveness to an influx of customer requests that isn't humanly possible – or at least isn't possible without spending a fortune on staffing.